ABSTRACT

Monitoring and control of the process of studying a distance learning course rely on
assigning the educatee, from testing results, an adequate integral mark for mastering the
entire study course. The article proposes using the degree of correspondence between the
educatee's thesaurus and the study course thesaurus as this integral mark. The study
course thesaurus is a set of course objects with the relations between them specified.
The article considers metrics of study course thesaurus complexity based on graph theory
and information theory, and suggests the amount of information contained in the study
course thesaurus graph as the complexity metric. The educatee's thesaurus is treated as
an object for measuring the educational material learned at the semantic level; it is
assessed by the amount of information contained in its graph, taking into account the
factors of learning the thesaurus objects.

Keywords: e-learning, thesaurus, learning management system, thesaurus metrics, 
knowledge measurement, study course material, knowledge testing. 

INTRODUCTION 

The educational process presupposes purposeful influence on the educatee's thesaurus.
Currently, distance learning systems offer no way of monitoring knowledge by the degree
of correspondence between the educatee's thesaurus and the study course thesaurus. In a
distance learning system, the educational process consists of a sequence of cycles in
which the educatee is provided with educational material and learns it. Each such cycle
results in expansion of the educatee's thesaurus. The concept of an "individual's
professional thesaurus" is defined in the work of I.R.Abdulmyanov (2010). The study
course thesaurus is a set of study course objects (concepts, laws, theorems, statements,
etc.) with the relations between them specified.

The educatee's thesaurus is an object for measuring the educational material learned at
the semantic level. Let us assume that a distance learning system, with the study course
described by the thesaurus $I_d$, provides the educatee, whose thesaurus is $I_s$, with
educational material described by the thesaurus $I_f \subseteq I_d$. The possibilities
for the educatee to learn this educational material are as follows:

1) If $I_f \subset I_s$, there will be no changes in the educatee's thesaurus during the
education, since this information is already known to the educatee.

2) If $I_f \cap I_s \neq \emptyset$ and $I_s \subset I_d$, the educational material can
be learned by the educatee if desired, and as a result the educatee's thesaurus will be
expanded.

The educatee acquires the maximum amount of semantic information when their thesaurus is
coordinated with the thesaurus of the study course material, i.e. when the educational
material is understandable to the educatee and carries information that is absent from
their thesaurus.

Thesaurus presentation of the educational material, and of the educatee's current state
of knowledge, ensures adaptive selection and ordering of the educational information.
The process of forming a thesaurus with knowledge presentation methods is described in
detail in the work of S.Bechhofer and C.Goble (2001). The process is characterized by
understanding not only the attributes of a thesaurus object but also the relations of
the object with other objects (D.Soergel, B.Lauser, A.Liang, F.Fisseha, J.Keizer,
S.Katz, 2004).

Metrics described in the works by D.Bonchev and G.A.Buck (2005) and A.Gangemi,
C.Catenacci, M.Ciaramita and J.Lehmann (2005) can be used for quantitative assessment
of the complexity of a thesaurus presented in the form of a graph. Metrics used for
ontologies can be used for comparative analysis of thesauruses, since thesauruses can be
considered a type of ontology. However, before the ontology comparison metrics described
in the works by A.Lozano-Tello and A.Gomez-Perez (2004) and A.Maedche and S.Staab (2002)
can be applied to comparative analysis of the study course thesaurus and the educatee's
thesaurus, they must be improved: the result of comparing the educatee's thesaurus with
the study course thesaurus must be a mark describing not only the correspondence between
their structures, but also the degree of mastering the study course.

In distance learning systems, the degree of mastering the study course is assessed by
the results of testing the educatees (J.Myrick, 2010). Currently, much attention is
given to increasing the accuracy of assessing the results of education in distance
learning systems. For this purpose, A.A.Rybanov's work (2013) suggests taking into
account the process by which the user forms the final answer to test items, and the work
by K.Scalise and B.Gifford (2006) suggests innovative test item forms for computer-aided
knowledge testing. The integral mark for the quality of mastering a distance learning
course is calculated from the educatee's marks for all tests within the study course.
For example, the Moodle system offers the following approaches to calculating the
integral mark for the quality of mastering a study course (S.S.Nash, W.Rice, 2010):
"mean of grades", "weighted mean of grades", "simple weighted mean of grades", "mean of
grades" (with extra credits), "median of grades", "lowest of grades", "highest of
grades", "mode of grades", "sum of grades". Among these approaches, only the "weighted
mean of grades" takes into account the complexity of learning an educational module, by
assigning a weight factor to the test associated with the module. This raises the
problem of determining the weight factors of educational modules within the distance
learning course. Determining the factors by the subjective weighing method (i.e. the
factors are set by the author of the distance learning course) introduces an error into
the final mark value. Thesaurus presentation of the study course makes it possible to
determine the weight of each thesaurus object on a sounder and more objective basis.
Weights of the tests can then be determined by matching test items with thesaurus
objects within the study course.
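The effect of weight factors on the integral mark can be illustrated with a minimal sketch. It is a generic weighted mean, not Moodle's implementation; the function name `weighted_mean`, the grades and the weights are all hypothetical values chosen for illustration.

```python
def weighted_mean(grades, weights):
    """Weighted mean of test grades; each weight stands for module complexity."""
    if not grades or len(grades) != len(weights):
        raise ValueError("grades and weights must be non-empty and equally long")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(g * w for g, w in zip(grades, weights)) / total_weight


# Hypothetical marks for three module tests on a 0-100 scale.
grades = [80.0, 60.0, 90.0]
# Hypothetical weight factors, e.g. taken from thesaurus object weights.
weights = [1.0, 3.0, 2.0]

integral_mark = weighted_mean(grades, weights)
simple_mean = sum(grades) / len(grades)
print(round(integral_mark, 3), round(simple_mean, 3))
```

With these numbers the heavily weighted second test pulls the integral mark below the simple mean, which is why an error in the weight factors propagates directly into the final mark.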
Thesaurus objects which are difficult to learn can be identified on the basis of
thesaurus presentation of the educational material and comparative analysis of the
educatees' testing results. The set of such thesaurus objects can be used for a more
well-grounded strategy of correcting the educational material and tests.

More precise learning curves can be constructed by using the degree of correspondence
between the educatee's thesaurus and the study course thesaurus as a learning
achievements metric (Figure: 1). Learning curves are a basis for classifying educatees
into extroverts and introverts: introverted subjects have a concave learning curve
caused by a long phase of latent accumulation of knowledge and skills.



Figure: 1 

Dynamics of changing learning achievements during education. 

All the above-mentioned directions of monitoring and control over the process of
studying within the distance learning course rely on assigning the educatee, from
testing results, an adequate integral mark for mastering the entire study course. This
problem can be solved by measuring the degree of correspondence between the educatee's
thesaurus and the study course thesaurus.

MATHEMATICAL DESCRIPTION 

Model of the distance learning course thesaurus 

The thesaurus describing the system of the study course objects can be presented in the
form of an oriented graph $G = (V, E)$, where $V$ is a set of vertexes (study course
thesaurus objects) and $E$ is a set of arcs (oriented edges describing the logic of
studying the study course objects). Let us introduce the following symbols: $n = |V|$,
$m = |E|$. Let us consider the set $E$ of the logical relations between the study course
thesaurus objects. Let us assume that $(v_i, v_j) \in E$ if $v_i$ is a direct semantic
component of $v_j$. Let us also assume that $A$ is the adjacency matrix of the study
course thesaurus graph $G$, where the matrix element $a_{ij} = 1$ if $(v_i, v_j) \in E$,
and $a_{ij} = 0$ otherwise. Then $A^L$ is a matrix showing the quantity of paths of
length $L$ between any two objects $v_i$ and $v_j$. The quantity of these paths is given
by the number at the intersection of the $i$-th row and the $j$-th column of the matrix
$A^L$. Let us designate an element of the matrix $A^L$ as $a_{ij}^{(L)}$. Then:

$$a_{ij}^{(1)} = a_{ij}, \tag{1}$$

$$a_{ij}^{(L+1)} = \sum_{k=1}^{n} a_{ik}^{(L)} a_{kj}. \tag{2}$$
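Counting paths with powers of the adjacency matrix can be sketched in a few lines. The four-object graph and the helper names `mat_mul` and `path_counts` are hypothetical and serve only as an illustration.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def path_counts(adj, length):
    """Return the matrix A^length, built step by step from the adjacency matrix."""
    if length < 1:
        raise ValueError("length must be at least 1")
    power = [row[:] for row in adj]      # first power: the adjacency matrix itself
    for _ in range(length - 1):
        power = mat_mul(power, adj)      # next power: sum over k of a_ik^(L) * a_kj
    return power


# Hypothetical thesaurus with four objects and arcs 0->1, 0->2, 1->3, 2->3.
A = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

A2 = path_counts(A, 2)   # paths made of two arcs
A3 = path_counts(A, 3)   # paths made of three arcs
print(A2[0][3], any(any(row) for row in A3))
```

In this graph there are two two-arc paths from object 0 to object 3 (through objects 1 and 2), and the third power is the zero matrix because no path has three arcs.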


The graph $G$ describing the study course thesaurus must meet the following
requirements:

1) There must be no isolated vertexes in the study course thesaurus graph:

$$\sum_{i=1}^{n} a_{ik}^{(1)} + \sum_{j=1}^{n} a_{kj}^{(1)} \geq 1, \quad \forall v_k \in V. \tag{3}$$

2) There must be no circuits in the study course thesaurus graph, i.e. any matrix $A^L$
must meet the following condition:

$$\sum_{k=1}^{n} a_{kk}^{(L)} = 0. \tag{4}$$

3) There must be no duplicate connections between vertexes of the study course graph,
i.e. if there are arcs $(v_i, v_j)$, $(v_j, v_k)$ and $(v_i, v_k)$, the arc $(v_i, v_k)$
can be removed since, according to the transitivity property, it duplicates the
requirements on the sequence of studying the thesaurus objects $v_i$ and $v_k$.
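The three requirements can be checked mechanically. The sketch below is one possible reading, assuming the circuit condition means that every power of the adjacency matrix has a zero diagonal and that a duplicate arc is one that is also covered by a longer path; the function names and the sample graph are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def isolated_vertices(adj):
    """Vertexes with neither entering nor leaving arcs (requirement 1)."""
    n = len(adj)
    return [k for k in range(n)
            if sum(adj[i][k] for i in range(n)) + sum(adj[k]) == 0]


def has_circuit(adj):
    """True if some power A^L, L = 1..n, has a non-zero diagonal (requirement 2)."""
    n = len(adj)
    power = [row[:] for row in adj]
    for _ in range(n):
        if any(power[k][k] for k in range(n)):
            return True
        power = mat_mul(power, adj)
    return False


def redundant_arcs(adj):
    """Arcs (i, k) that are duplicated by a path of two or more arcs (requirement 3)."""
    n = len(adj)
    longer = [[0] * n for _ in range(n)]   # counts of paths with at least two arcs
    power = mat_mul(adj, adj)
    for _ in range(2, n):                  # path lengths 2 .. n-1
        for i in range(n):
            for k in range(n):
                longer[i][k] += power[i][k]
        power = mat_mul(power, adj)
    return [(i, k) for i in range(n) for k in range(n)
            if adj[i][k] and longer[i][k]]


# Hypothetical graph with arcs 0->1, 1->2, 0->2, 2->3; arc 0->2 is a duplicate.
A = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

print(isolated_vertices(A), has_circuit(A), redundant_arcs(A))
```

The sample graph has no isolated vertexes and no circuits, but the arc from object 0 to object 2 is reported as redundant because the path through object 1 already fixes the order of studying.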


Let us assume that the entrance study course thesaurus objects are all objects $v_k$
which meet the following condition:

$$\sum_{i=1}^{n} a_{ik}^{(1)} = 0. \tag{5}$$

Let us also assume that the exit study course thesaurus objects are all objects $v_k$
which meet the following condition:

$$\sum_{j=1}^{n} a_{kj}^{(1)} = 0. \tag{6}$$

When analyzing the subject matter thesaurus, it is important to know which objects are
used for forming other objects, and what these other objects are. To describe the
relative duration of forming the study course thesaurus objects, the reachability matrix
$D$ is used:

$$D = \sum_{L=1}^{N} A^{L}. \tag{7}$$

Here $d_{ij}$ is an element of the matrix $D$ which shows after how many cycles
following the object $v_i$ the object $v_j$ will be formed; $N$ is the order of the
study course thesaurus graph, i.e. the largest power for which the matrix is non-zero
(the length of the longest path): $A^N \neq 0$, $A^{N+1} = 0$.
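A minimal sketch of the entrance and exit conditions and of the reachability matrix follows. It sums the powers of the adjacency matrix until the power becomes zero, so each entry of the result counts the paths of all lengths between two objects; the function names and the sample graph are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def entrance_objects(adj):
    """Objects with no entering arcs (zero column sum)."""
    n = len(adj)
    return [k for k in range(n) if sum(adj[i][k] for i in range(n)) == 0]


def exit_objects(adj):
    """Objects with no leaving arcs (zero row sum)."""
    return [k for k in range(len(adj)) if sum(adj[k]) == 0]


def reachability(adj):
    """Return (D, N): D = A^1 + ... + A^N, where A^N is the last non-zero power."""
    n = len(adj)
    d = [[0] * n for _ in range(n)]
    power, length = [row[:] for row in adj], 1
    while any(any(row) for row in power):
        if length >= n:                      # a non-zero A^n means a circuit
            raise ValueError("graph contains a circuit")
        for i in range(n):
            for j in range(n):
                d[i][j] += power[i][j]
        power, length = mat_mul(power, adj), length + 1
    return d, length - 1


# Hypothetical thesaurus with arcs 0->1, 0->2, 1->3, 2->3.
A = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

D, N = reachability(A)
print(entrance_objects(A), exit_objects(A), N)
print(D)
```

For this graph object 0 is the only entrance object, object 3 the only exit object, and the summation stops after the second power.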
Metrics of complexity of the distance learning course thesaurus on the basis of the
graph theory

To describe characteristics of the study course thesaurus presented in the form of the
graph $G$, the following graph metrics can be used:

1) Order of the study course thesaurus graph: $n(G) = n$.

2) Size of the study course thesaurus graph: $s(G) = m$.

3) Diameter of the study course thesaurus (length of the maximum path between the
entrance objects $v_i$ and the exit objects $v_j$ of the thesaurus, expressed by the
number of arcs which make up this path):

$$diam(G) = \max_{d_{ij} \in D} d_{ij}. \tag{8}$$

4) Structural redundancy $R(G)$ of the study course thesaurus graph shows the excess of
the total quantity of connections between vertexes of the graph $G$ over the minimum
quantity of connections:

$$R(G) = \frac{m}{n-1} - 1. \tag{9}$$

5) Edge density $Q(G)$ (characterizes proximity of the graph $G$ to the fully connected
graph):

$$Q(G) = \frac{2m}{n(n-1)}. \tag{10}$$

6) Absolute depth of the graph $H'(G)$ (A.Gangemi, C.Catenacci, M.Ciaramita, J.Lehmann,
2005):

$$H'(G) = \sum_{j}^{P} N_{j \in P}. \tag{11}$$

Here $N_{j \in P}$ is the length of the $j$-th path belonging to the set of all paths
$P$ in the graph $G$.

7) Average depth of the graph $h(G)$ (A.Gangemi, C.Catenacci, M.Ciaramita, J.Lehmann,
2005):

$$h(G) = \frac{1}{n_P} \sum_{j}^{P} N_{j \in P}. \tag{12}$$

Here $n_P$ is the quantity of paths in the set $P$.
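The graph-level metrics can be computed together from the adjacency matrix. The sketch below is an illustration under stated assumptions: the diameter is taken as the number of arcs in the longest path (the verbal definition), path lengths are counted in arcs, redundancy is read as m/(n-1) - 1 and edge density as 2m/(n(n-1)); the function name `graph_metrics` and the sample graph are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def graph_metrics(adj):
    """Topological metrics of an acyclic graph given as a 0/1 adjacency matrix."""
    n = len(adj)                          # order of the graph
    if n < 2:
        raise ValueError("at least two vertexes are required")
    m = sum(sum(row) for row in adj)      # size of the graph (number of arcs)
    path_count = 0                        # quantity of all paths
    absolute_depth = 0                    # sum of the lengths of all paths
    diameter = 0                          # arcs in the longest path
    power, length = [row[:] for row in adj], 1
    while any(any(row) for row in power):
        if length >= n:                   # a non-zero A^n means a circuit
            raise ValueError("graph contains a circuit")
        paths_of_length = sum(sum(row) for row in power)
        path_count += paths_of_length
        absolute_depth += length * paths_of_length
        diameter = length
        power, length = mat_mul(power, adj), length + 1
    return {
        "order": n,
        "size": m,
        "diameter": diameter,
        "redundancy": m / (n - 1) - 1,
        "density": 2 * m / (n * (n - 1)),
        "absolute_depth": absolute_depth,
        "average_depth": absolute_depth / path_count if path_count else 0.0,
    }


# Hypothetical thesaurus with arcs 0->1, 0->2, 1->3, 2->3.
A = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

metrics = graph_metrics(A)
for name, value in metrics.items():
    print(name, value)
```

The sample graph has four one-arc paths and two two-arc paths, so its absolute depth is 8 and its average depth is 8/6.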

Quantitative characteristics of the study course thesaurus objects can be described by
the following metrics:

1) Let us define the weight of the study course thesaurus object associated with the
vertex $v_k$ as the quantity of all paths passing through the vertex $v_k$:

$$w_k = \sum_{i=1}^{n} d_{ik} + \sum_{j=1}^{n} d_{kj}. \tag{13}$$

Here $d_{ij}$ is an element of the reachability matrix $D$, which shows how many paths,
irrespective of their lengths, there are between the vertexes $v_i$ and $v_j$.

2) Rank of the object $v_j$ of the study course thesaurus (equal to the quantity of arcs
in the maximum length path in the graph $G$ from an entrance study course thesaurus
object to the object $v_j$):

$$\rho_j = L \quad \text{at} \quad \sum_{i=1}^{n} a_{ij}^{(L)} > 0 \quad \text{and} \quad \sum_{i=1}^{n} a_{ij}^{(L+1)} = 0. \tag{14}$$

When the ranks of all study course objects are determined, it is possible to construct
the study course thesaurus graph ordered by cycles.

3) Degree of the study course thesaurus object is determined by summing up the in-degree
and out-degree of the vertex $v_k$ associated with the thesaurus object:

$$\sigma_k = \sum_{j=1}^{n} a_{kj} + \sum_{i=1}^{n} a_{ik}. \tag{15}$$

The metrics presented above allow assessing the topological complexity of the study
course thesaurus graph and give an idea of the complexity of learning the distance
learning course.
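The object-level metrics can be sketched in the same style. The code assumes that the weight of an object is the number of paths ending at it plus the number of paths starting from it, taken from the sum of adjacency matrix powers, and that entrance objects receive rank 0 by convention; the function name `object_metrics` and the sample graph are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def object_metrics(adj):
    """Return (weights, ranks, degrees) for every vertex of an acyclic graph."""
    n = len(adj)
    d = [[0] * n for _ in range(n)]       # sum of the non-zero powers of adj
    rank = [0] * n                        # assumption: entrance objects get rank 0
    power, length = [row[:] for row in adj], 1
    while any(any(row) for row in power):
        if length >= n:                   # a non-zero A^n means a circuit
            raise ValueError("graph contains a circuit")
        for i in range(n):
            for j in range(n):
                d[i][j] += power[i][j]
        for j in range(n):
            # A non-zero column sum means some path of this length ends at j.
            if any(power[i][j] for i in range(n)):
                rank[j] = length
        power, length = mat_mul(power, adj), length + 1
    weight = [sum(d[i][k] for i in range(n)) + sum(d[k]) for k in range(n)]
    degree = [sum(adj[i][k] for i in range(n)) + sum(adj[k]) for k in range(n)]
    return weight, rank, degree


# Hypothetical thesaurus with arcs 0->1, 0->2, 1->3, 2->3.
A = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

weights, ranks, degrees = object_metrics(A)
print(weights, ranks, degrees)
```

Grouping the objects by rank gives the layers of the graph ordered by cycles: object 0 first, objects 1 and 2 next, object 3 last.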

Metrics of complexity of the distance learning course thesaurus on the basis of the
information theory

Let us describe the metrics of complexity of the study course thesaurus on the basis of
Shannon's information theory (C. E. Shannon, W. Weaver, 1949). According to the
information theory, the informational entropy $H$ of a message of $N$ symbols divided,
according to some criterion, into $k$ groups of $N_1, N_2, \ldots, N_k$ symbols is
calculated by the following formula:

$$H = -\sum_{i=1}^{k} p_i \log_2 p_i. \tag{16}$$

Here $p_i = N_i / N$ is the probability of presence of the $i$-th group symbols in the
message.

The study course thesaurus graph is specified by a finite set of elements (vertexes,
edges, arcs, cliques, etc.). Let us assume that $N$ is the quantity of the study course
thesaurus graph's elements. The weight of each study course thesaurus graph's element is
$w_i$, $i = \overline{1, N}$.

Let us determine the total weight of the study course thesaurus graph by the following
expression:

$$W = \sum_{i=1}^{N} w_i. \tag{17}$$

The probability of presence of the $i$-th element with weight $w_i$ in the study course
thesaurus graph is calculated as follows:

$$p_i = \frac{w_i}{W}. \tag{18}$$


Thus the probability scheme of the study course thesaurus graph can be described by
Table: 1.

Table: 1

Probability scheme of the study course thesaurus graph

Element        1        2        ...      N
Weight         $w_1$    $w_2$    ...      $w_N$
Probability    $p_1$    $p_2$    ...      $p_N$

Entropy of the study course thesaurus graph with total weight $W$ and weights of the
elements $w_i$, $i = \overline{1, N}$, for the specified probability scheme (Table: 1)
is determined by the following expression:

$$H = -\sum_{i=1}^{N} \frac{w_i}{W} \log_2 \frac{w_i}{W} = -\sum_{i=1}^{N} \frac{w_i}{W} \log_2 w_i + \sum_{i=1}^{N} \frac{w_i}{W} \log_2 W = \log_2 W - \frac{1}{W} \sum_{i=1}^{N} w_i \log_2 w_i. \tag{19}$$

According to Shannon's information theory, the amount of information is defined as the
decrease in the system entropy relative to the maximum entropy which can exist in a
system with the same quantity of elements:

$$I = H_{max} - H. \tag{20}$$

Informational entropy of the study course thesaurus graph possesses the maximum value
when $w_i = 1$ (Formula 19) and is determined as follows:

$$H_{max} = \log_2 W. \tag{21}$$

Thus the expression for determining the amount of information contained in the study
course thesaurus graph takes the following form:

$$I = \frac{1}{W} \sum_{i=1}^{N} w_i \log_2 w_i. \tag{22}$$

This expression is the metric of complexity of the study course thesaurus and can be
used for assessing the degree of correspondence between the educatee's thesaurus graph
and the study course thesaurus graph.
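The derivation can be checked numerically. The sketch below computes the entropy of a weight distribution and the amount of information as the weighted mean of the base-2 logarithms of the weights, then confirms that the latter equals the maximum entropy minus the entropy; the weights are hypothetical and the function names are illustrative.

```python
import math


def entropy(weights):
    """Shannon entropy of the probability scheme p_i = w_i / W, in bits."""
    total = sum(weights)
    return -sum(w / total * math.log2(w / total) for w in weights if w > 0)


def information_amount(weights):
    """Amount of information: (1 / W) * sum of w_i * log2(w_i), in bits."""
    total = sum(weights)
    return sum(w * math.log2(w) for w in weights if w > 0) / total


# Hypothetical weights of four thesaurus objects.
weights = [4, 2, 2, 4]

H = entropy(weights)
H_max = math.log2(sum(weights))      # maximum entropy for the same total weight
I = information_amount(weights)

print(round(H, 4), round(H_max, 4), round(I, 4))
print(math.isclose(I, H_max - H))    # the two routes to I agree
```

When every weight equals 1 the logarithms vanish and the amount of information is zero, which matches the statement that the entropy is then at its maximum.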

Educatee's thesaurus model 

Let us define the educatee's thesaurus graph $G' = (U, E')$ as a subgraph induced by a
subset of vertexes of the study course thesaurus graph $G = (V, E)$, where
$U \subseteq V$ and $E'$ consists of all those arcs of the graph $G$ whose both ends
belong to $U$. Each vertex of the graph $G'$ is associated with a learned object of the
distance learning course and, as a quantitative characteristic, is described by the
degree of mastering $\lambda_k \in [0; 1]$ the educational material connected with the
concept $u_k$.

Let us present the dynamics of the process of studying the educational material
described by the thesaurus $G = (V, E)$ as a finite ordered sequence of the educatee's
thesaurus graphs:

$$\varphi = \{G'_1, \ldots, G'_i, \ldots, G'_r\},$$

$$G'_i \cap G'_{i+1} = G'_i, \quad i = \overline{1, r-1},$$

$$G'_r \cap G = G'_r,$$

$$\Delta_{i+1} = n(G'_{i+1}) - n(G'_i).$$

Here $G'_i$ is a subgraph on a subset of the vertexes of the graph $G'_{i+1}$,
$U_i \subseteq U_{i+1}$; $G'_r$ is a subgraph on a subset of the vertexes of the graph
$G$, $U_r \subseteq V$; $\Delta_{i+1}$ is the quantity of new objects with which the
educatee's thesaurus graph $G'_i$ has been expanded. The set $\varphi$ describes the
process of changing the educatee's thesaurus connected with learning new objects of the
study course thesaurus. During learning of the study course, the conceptual base of the
educatee's thesaurus expands, which leads to an increase in relations between the
concepts. Let us determine the weight of an object in the educatee's thesaurus,
associated with the vertex $u_k$, as the product of the degree of mastering
$\lambda_k \in [0; 1]$ the study course thesaurus object by the educatee and the weight
of this object in the study course thesaurus graph:

$$w'_k = \lambda_k w_k. \tag{23}$$

Metrics of the study course thesaurus can be used as metrics of complexity of the
educatee's thesaurus.

The degree of correspondence between the educatee's thesaurus graph $G'_i$ and the study
course thesaurus graph $G$ can be determined as follows:

$$\delta(G'_i) = \frac{I'(G'_i)}{I(G)} \cdot 100\%. \tag{24}$$

Here $I'(G'_i)$ is the amount of information in the educatee's thesaurus graph $G'_i$,
which is calculated by the formula (22).
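A minimal sketch of the correspondence degree follows. It assumes that the amount of information is computed exactly as written in formula (22), including the division by the total weight, and that each learned object contributes its course weight multiplied by its degree of mastering; the course weights, the mastering values and the function names are hypothetical.

```python
import math


def information_amount(weights):
    """Amount of information: (1 / W) * sum of w_i * log2(w_i), in bits."""
    total = sum(weights)
    return sum(w * math.log2(w) for w in weights) / total


def correspondence_degree(course_weights, mastering):
    """Correspondence degree in percent.

    course_weights: weight of every object in the study course thesaurus graph.
    mastering: {object index: degree of mastering in (0, 1]} for learned objects.
    """
    course_information = information_amount(course_weights)
    if course_information <= 0:
        raise ValueError("course thesaurus carries no information")
    # Weight of a learned object: degree of mastering times its course weight.
    learned = [lam * course_weights[k] for k, lam in mastering.items() if lam > 0]
    if not learned:
        return 0.0
    return information_amount(learned) / course_information * 100.0


# Hypothetical course with four objects and their weights.
course_weights = [4, 2, 2, 4]

# Hypothetical educatee who has learned objects 1 and 2 at the "good" level.
partial = correspondence_degree(course_weights, {1: 0.8, 2: 0.8})
# Hypothetical educatee who has fully mastered every object.
full = correspondence_degree(course_weights, {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0})

print(round(partial, 2), round(full, 2))
```

Full mastery of every object reproduces the course weights and therefore gives 100%, while the partially learned thesaurus in this example scores well below that.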

RESULTS AND DISCUSSION 

To analyze the metrics suggested in this paper (Formulas 8-12, 22, 24), an experiment
has been carried out in which the process of studying educational material has been
modeled. The educational material thesaurus graph presented in Figure: 2 consists of 50
objects. The entrance objects of the educational material thesaurus are the objects 1,
4, 12, and 44. The educational material thesaurus graph has the following values of the
metrics: $diam(G) = 8$, $n(G) = 50$, $R(G) = .020$, $Q(G) = .041$, $H'(G) = 809$,
$h(G) = 3.487$, $I(G) = 1796.002$.



Figure: 2 

Educational material thesaurus graph G 
The purpose of the experiment was to track the changes in the metric characteristics of
the educatee's thesaurus graph formed while studying the educational material.

Initial data for carrying out the experiment are as follows:


Dynamics of the process of studying the educational material presented with the
thesaurus graph $G$ is described by the following sequence of changing the educatee's
thesaurus graph (See Appendix):

$$\varphi = \{G'_1, G'_2, G'_3, G'_4, G'_5, G'_6, G'_7, G'_8, G'_9, G'_{10}\}.$$

Metrics of the educatee's thesaurus graphs are shown in Table: 2. 


Table: 2

Metrics of the educatee's thesaurus graphs

Educatee's
thesaurus graph   diam(G'_i)   n(G'_i)   R(G'_i)   Q(G'_i)   H'(G'_i)   h(G'_i)
G'_1              1            5         0         .400      4          1.000
G'_2              3            10        0         .200      26         1.625
G'_3              3            15        0         .133      39         1.560
G'_4              4            20        0         .100      77         1.833
G'_5              5            25        0         .080      157        2.309
G'_6              5            30        0         .067      204        2.345
G'_7              7            35        0         .057      324        2.723
G'_8              7            40        0         .050      430        2.886
G'_9              8            45        0         .044      499        2.970
G'_10             8            50        .020      .041      809        3.487



Figure: 3

Dynamics of changing the degree of correspondence between the metrics $n(G'_i)$,
$Q(G'_i)$, $H'(G'_i)$, $h(G'_i)$ and similar metrics of the educational material
thesaurus graph (horizontal axis: educatee's thesaurus graph)
The metrics based on the graph theory (Formulas 8-12) do not take into account the
degree to which the educatee has mastered the objects of the educational material
thesaurus and therefore have identical values for topologically identical thesaurus
graphs of different educatees. Besides, expanding the conceptual base of the educatee's
thesaurus does not always change the values of the metrics $diam(G'_i)$ and $R(G'_i)$
or increase the metric $h(G'_i)$ (for example, educatee's thesaurus graphs $G'_2$ and
$G'_3$). The metrics $n(G'_i)$ and $Q(G'_i)$ do not take into account the weight of the
objects of the educational material thesaurus; therefore the degree of correspondence
between these metrics for the educatee's thesaurus and the educational material
thesaurus depends linearly on the quantity of objects with which the conceptual base of
the educatee's thesaurus is expanded.

The metric $H'(G'_i)$ takes into account the weights of the educational material
thesaurus objects and can be used only for assessing complexity of the study course
thesaurus.

> Degree of mastering $\lambda_k$ the educational material connected with the concept
$u_k$ of the study course thesaurus graph $G$: $\lambda_k \in [0; 1]$,
$\forall k = \overline{1, n}$. The quantity $\lambda_k$ is described by the following
concept mastering categories: unsatisfactory - [0; .61); satisfactory - [.61; .76);
good - [.76; .90); excellent - [.90; 1]. Since in case of unsatisfactory mastering of a
study course concept all the concepts arising out of it cannot be mastered, the
quantity $\lambda_k$ was varied within the interval $\lambda_k \in [.61; 1]$.

> Expansion of the conceptual base of the educatee's thesaurus occurs with increments
$\Delta = 5$ concepts.
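The concept mastering categories amount to a simple threshold lookup. The sketch below uses the interval boundaries listed above; the function name `mastering_category` is hypothetical.

```python
def mastering_category(degree):
    """Map a degree of mastering in [0, 1] to its concept mastering category."""
    if not 0.0 <= degree <= 1.0:
        raise ValueError("degree of mastering must lie in [0, 1]")
    if degree < 0.61:
        return "unsatisfactory"   # [0; .61)
    if degree < 0.76:
        return "satisfactory"     # [.61; .76)
    if degree < 0.90:
        return "good"             # [.76; .90)
    return "excellent"            # [.90; 1]


for value in (0.5, 0.61, 0.8, 0.95):
    print(value, mastering_category(value))
```

Only the three upper categories occur in the experiment, since a concept mastered unsatisfactorily blocks the concepts that arise out of it.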

Results of changing the metric $I'(G'_i)$ are shown in Table: 3, and the correspondence
degrees $\delta(G'_i)$ are shown in Table: 4.

Table: 3

Values of the metric $I'(G'_i)$ (bits)

Educatee's        Degree of mastering $\lambda_k$
thesaurus graph   [.61;.76)   [.61;.90)   [.76;.90)   [.76;1]    [.90;1]    1
G'_1              1.675       2.208       3.103       5.009      5.917      6.755
G'_2              28.339      33.437      38.247      47.923     52.808     58.894
G'_3              46.415      57.029      69.461      79.970     91.289     97.400
G'_4              127.073     166.283     167.941     191.029    220.562    240.865
G'_5              265.555     276.504     349.899     379.486    425.265    465.161
G'_6              337.307     389.449     443.160     481.549    540.448    589.141
G'_7              459.584     551.091     637.198     725.756    767.073    830.528
G'_8              610.144     690.862     798.965     922.250    996.544    1068.03
G'_9              677.172     809.849     922.813     1001.127   1121.270   1208.266
G'_10             1005.884    1221.713    1353.899    1457.382   1667.390   1796.002
Table: 4

Values of the correspondence degree $\delta(G'_i)$ (%)

Educatee's        Degree of mastering $\lambda_k$
thesaurus graph   [.61;.76)   [.61;.90)   [.76;.90)   [.76;1]    [.90;1]    1
G'_1              .093        .123        .173        .279       .329       .376
G'_2              1.578       1.862       2.130       2.668      2.940      3.279
G'_3              2.584       3.175       3.868       4.453      5.083      5.423
G'_4              7.075       9.258       9.351       10.636     12.281     13.411
G'_5              14.786      15.396      19.482      21.129     23.678     25.900
G'_6              18.781      21.684      24.675      26.812     30.092     32.803
G'_7              25.589      30.684      35.479      40.410     42.710     46.243
G'_8              33.972      38.467      44.486      51.530     55.487     59.467
G'_9              37.704      45.092      51.382      55.742     62.431     67.275
G'_10             56.007      68.024      75.384      81.146     92.839     100


Comparative analysis of the experimental data (Figure: 4) has shown that:

> When the degree of mastering the objects of the educational material thesaurus
increases, the degree of correspondence $\delta(G'_i)$ between the educatee's thesaurus
graph $G'_i$ and the educational material thesaurus graph $G$ increases too.

> Increase in complexity of topology of the graph $G'_i$ caused by expanding the
conceptual base of the educatee's thesaurus leads to increase in the degree of
correspondence $\delta(G'_i)$ between the educatee's thesaurus graph $G'_i$ and the
educational material thesaurus graph $G$.

> The greater the weight of the study course thesaurus objects learned by the educatee,
the greater the increment $\Delta\delta$.


Figure: 4

Dynamics of change of $\delta(G'_i)$ during expanding the educatee's thesaurus (curves
for $\lambda_k \in [.61;.76)$, $[.61;.90)$, $[.76;.90)$, $[.76;1]$, $[.90;1]$ and
$\lambda_k = 1$).
These facts lead to the conclusion that the metric $\delta(G'_i)$ constructed according
to the information theory is an objective assessment of the degree of mastering the
distance learning course.

CONCLUSION 

Comparative analysis of metrics of the educational material thesaurus graph has shown
that the metric $H'(G)$ (Formula 11) or the metric $I(G)$ (Formula 22) can be used for
assessing complexity of distance learning courses. To assess the degree of
correspondence between the educational material thesaurus graph and the educatee's
thesaurus graph, it is recommended to use the metric $\delta(G'_i)$ (Formula 24)
constructed on the basis of the semantic information amount measure, since it takes into
account the degree of mastering the educational material thesaurus objects by the
educatee. The suggested metrics can be used for monitoring the distance learning
process.